Diffusion Models for Sketches

text-to-sketch

  • [1][SVG]: a) Use a text-to-image diffusion model to generate an image, then align the image and the sketch through CLIP. b) Align the text and the sketch directly through the diffusion model via SDS.
  • [2][SVG]: Align the text and the sketch through CLIP (a minimal sketch of this loss follows the list).
  • [3][SVG]: Align the text and the sketch through the diffusion model via SDS (a schematic of the SDS step appears below, after the abbreviations).
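
To make the CLIP-based alignment in [1] and [2] concrete, here is a minimal sketch of the loss, assuming a placeholder differentiable rasterizer `rasterize` (the papers use diffvg) and omitting CLIP's image preprocessing for brevity; it is an illustration, not the papers' implementation.

```python
# Minimal sketch of CLIP text-sketch alignment (CLIPDraw-style).
# Assumption: `rasterize(strokes)` is a differentiable rasterizer (e.g. diffvg)
# returning a (1, 3, 224, 224) image tensor; CLIP preprocessing is omitted.
import torch
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # keep everything in fp32 for simplicity

def clip_text_loss(strokes, prompt, rasterize):
    """Negative cosine similarity between the rendered sketch and the text prompt."""
    image = rasterize(strokes)                          # differentiable render of the strokes
    image_feat = model.encode_image(image.to(device))   # CLIP image embedding
    text_feat = model.encode_text(clip.tokenize([prompt]).to(device))
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    return -(image_feat * text_feat).sum()              # minimizing aligns sketch with text

# Usage sketch: the stroke control points are tensors with requires_grad=True,
# updated by an optimizer (e.g. Adam) that backpropagates through the rasterizer.
```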

image-to-sketch

  • [4][SVG]: Align the image and the sketch; use an MLP to predict offsets from the initial control points (a small sketch of this follows the list).
  • [5][SVG]: Align the image and the sketch; use a saliency map to initialize the stroke points.
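
As a rough illustration of the offset-prediction idea in [4], the snippet below uses a small MLP to map the initial control points to per-point displacements; the layer sizes, names, and point layout are assumptions made for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class OffsetMLP(nn.Module):
    """Hypothetical MLP predicting per-point offsets from the initial control points."""
    def __init__(self, num_points, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_points * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_points * 2),
        )

    def forward(self, init_points):
        # init_points: (num_points, 2) in normalized canvas coordinates
        offsets = self.net(init_points.flatten()).view_as(init_points)
        return init_points + offsets  # displaced points = initialization + predicted offsets

# Usage sketch: the MLP's weights (not the points themselves) are optimized so that the
# sketch rasterized from the displaced points matches the input image under a CLIP-based loss.
```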

SVG: Scalable Vector Graphics
SDS: score distillation sampling
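
[1] and [3] optimize the strokes against a frozen text-to-image diffusion model via SDS: render the sketch, noise its latent at a random timestep, and move the strokes so the model's noise prediction (conditioned on the prompt) matches the injected noise. Below is a schematic with placeholder callables (`rasterize`, `encode_to_latent`, `unet`, `scheduler`) standing in for a latent-diffusion pipeline such as Stable Diffusion; it is a sketch of the idea, not either paper's code.

```python
import torch

def sds_loss(strokes, prompt_embed, rasterize, encode_to_latent, unet, scheduler):
    """Schematic score distillation sampling (SDS) step for vector strokes.

    All callables are placeholders for a latent-diffusion pipeline; their names and
    signatures are assumptions, not a real library API.
    """
    image = rasterize(strokes)                        # differentiable render of the sketch
    latents = encode_to_latent(image)                 # encode the render into the LDM latent space
    t = torch.randint(50, 950, (1,), device=latents.device)
    noise = torch.randn_like(latents)
    noisy_latents = scheduler.add_noise(latents, noise, t)   # forward-diffuse to step t
    with torch.no_grad():                             # the diffusion model stays frozen
        noise_pred = unet(noisy_latents, t, prompt_embed)    # predicted noise given the prompt
    grad = noise_pred - noise                         # SDS gradient w.r.t. latents, weighting w(t) omitted
    # Surrogate loss whose gradient w.r.t. `latents` equals `grad`; backprop then
    # carries it through the encoder and rasterizer to the stroke parameters.
    return (grad.detach() * latents).sum()
```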

References

[1] Xing, Ximing, et al. “DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models.” arXiv preprint arXiv:2306.14685 (2023).

[2] Frans, Kevin, Lisa Soros, and Olaf Witkowski. “CLIPDraw: Exploring text-to-drawing synthesis through language-image encoders.” Advances in Neural Information Processing Systems 35 (2022): 5207-5218.

[3] Jain, Ajay, Amber Xie, and Pieter Abbeel. “Vectorfusion: Text-to-svg by abstracting pixel-based diffusion models.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.

[4] Vinker, Yael, et al. “CLIPascene: Scene sketching with different types and levels of abstraction.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.

[5] Vinker, Yael, et al. “CLIPasso: Semantically-aware object sketching.” ACM Transactions on Graphics (TOG) 41.4 (2022): 1-11.